Abstract

The paper is devoted to the statistical analysis of zero intelligence
models of the continuous double auction, in particular to the estimation
of parameters of limit order books. We review "classical" zero
intelligence models and show, by means of tick-by-tick quote (L1) data,
their poor fit to liquid markets. Therefore, we define a generalized zero
intelligence model which copes with the discrepancies found and devise a
method of its estimation, which we show to be - up to a minor
approximation - consistent and asymptotically normal. We demonstrate that
the model fits, at least for the most part, the data of three US stocks
from US electronic markets.

Keywords: Continuous double auction, zero intelligence models, econometric 
estimation, consistency, asymptotic normality 

1 Introduction 



As most of the trading at today's securities markets is done according to
the continuous double auction, the importance of mathematical modelling of
this trading mechanism grows. Since it is probably impossible to build an
at least partially tractable model of a market with the continuous double
auction assuming rationality of the agents, its existing models are "zero
intelligence" (ZI), i.e., assuming a purely random arrival and (eventual)
cancellation of the orders.

From a number of similar models of that kind, let us name Stigler (1964),
Maslov (2000), Luckock (2003), Smith et al. (2003), Mike and Farmer (2008)
and Cont et al. (2010). All of those models assume unit order sizes,
Poisson arrivals of market and limit orders with intensity dependent on
the current best quotes and locally constant cancellation rates of waiting
limit orders depending on the best quotes, too.^1 Some of the models
assume continuous prices (Maslov (2000); Luckock (2003)) while the rest of
them work with (more realistic) discrete (tick) prices.

As we already indicated, the distribution of even the simplest zero
intelligence models cannot be expressed analytically; however, partial
theoretical results are known (see Slanina (2001); Cont et al. (2010);
Cont (2011); Šmíd (2012)), including the ergodicity of a discrete model
with bounded prices (Cont et al. (2010)) and an analytic formula for the
conditional distribution of the order books given a history of the best
quotes, applicable to the majority of known ZI models (Šmíd (2012)). When
combined, these results are sufficient for the construction of consistent
and asymptotically normal estimators of the parameters of the models; one
such estimator is presented in this paper.

Even if the ZI models are claimed to be able to mimic many stylized facts
observed in reality, such as fat tails or non-Gaussianity (Slanina (2001,
2008); Šmíd (2008); Cont (2011)), "old-style" econometrics has, to the
best knowledge of the author, never been applied to verify their
accordance with reality. One of the reasons for this seems to be the
difficult accessibility of the order book (L2) data. However, as we show
in the present paper, L2 data are not necessary because a sample dependent
on the parameters of the limit order book may be elicited even by means of
quote (L1) data, even if only the parts of the order books closest to the
quotes may be estimated that way. In particular, we observe sizes of jumps
of the quotes out of the spread and use the fact that jumps greater than
one imply emptiness of the first tick of the order book, whose stochastic
properties depend on the order placement and cancellation parameters.

We start our exposition by an introduction of a sufficiently general
version of a ZI model, covering the models of Cont et al. (2010) and
Stigler (1964), a discretized version of Luckock (2003) and a slightly
modified version of Smith et al. (2003), and we prove the ergodicity of
the introduced model.

^1 The only exception is Mike and Farmer (2008) where the cancellation
rate depends, in addition, on the order books' size and imbalance.

Subsequently, we demonstrate the inconsistency of the ZI model with
reality. In particular, we show how the empirical conditional probability
of larger jumps of the ask evidently contradicts its theoretical
counterpart.^2

To accommodate the discrepancies found, we propose a generalized zero 
intelligence (GZI) model. In particular, while we keep assuming Poisson 
order arrival and time-constant cancellation rates, we add the possibility of 
shifting of the orders, allow slowly cancellable (strategic) orders and admit 
multiple orders' placement at the times of the quotes' jumps. Moreover, 
we propose an estimator of the model which we show to be consistent and 
asymptotically normal up to a small approximation. 

Subsequently, we apply our estimator to L1 data of three US stocks traded
at three US electronic markets. Our estimation is shown to bring at least
partially significant results for seven out of the nine stock-market
pairs. Moreover, as the parameters exclusive to the GZI model came out
significant in six cases, our empirical results falsify the ZI model in a
statistically correct way.

The paper is organized as follows: First, the ZI model is presented
(Section [2]) and the dataset we work with is introduced (Section [3]).
After a confrontation of the ZI model with the data (Section [4]), our
generalized model is formulated and its theoretical properties, necessary
for the estimation, are stated (Section [5]). Subsequently, the estimation
is performed and the results interpreted (Section [6]). Finally, the paper
is concluded (Section [7]). A proof of the asymptotic properties of the
NLS estimator is given in the Appendix.

2 A Zero Intelligence Model 

Let us consider a general discrete zero intelligence model with unit order
sizes described by a pure jump type process

$$\Xi_t = (A_t, B_t), \quad t \ge 0,$$

where

$$A_t = (A_t^1, A_t^2, \dots, A_t^n)$$

and

$$B_t = (B_t^1, B_t^2, \dots, B_t^n)$$

are the sell limit order book and the buy limit order book, respectively.
In particular, $A_t^p$ ($B_t^p$) stands for the number of the sell (buy)
limit orders with price $p$ waiting at time $t$. Further, denote

$$a_t := (n+1) \wedge \min\{p : A_t^p > 0\}, \qquad b_t := 0 \vee \max\{p : B_t^p > 0\}$$

the (best) ask and bid, respectively.

^2 Even if this disagreement is shown only graphically, our later empirical
results confirm it in a statistically correct way.

The list of possible events causing jumps of $\Xi$ together with their
intensities is given in the following table:

rate                                      description

$\theta_t = \theta(a_t, b_t)$             An arrival of a buy market order, causing
                                          $A_t^{a_t}$ to decrease by one (if the sell
                                          limit order book is empty then the arrival
                                          of the market order has no effect).

$\kappa_{t,p} = \kappa(a_t, b_t, p)$      An arrival of a sell limit order with limit
                                          price $p > b_t$ causing an increase of
                                          $A_t^p$ by one.

$\rho_{t,p} = A_t^p \rho(a_t, b_t, p)$    A cancellation of a pending sell limit order
                                          with a limit price $p$ causing a decrease of
                                          $A_t^p$ by one.

$\vartheta_t = \vartheta(a_t, b_t)$       An arrival of a sell market order, causing
                                          $B_t^{b_t}$ to decrease by one (if the buy
                                          limit order book is empty then the arrival
                                          of the market order has no effect).

$\lambda_{t,p} = \lambda(a_t, b_t, p)$    An arrival of a buy limit order with limit
                                          price $p < a_t$ causing an increase of
                                          $B_t^p$ by one.

$\sigma_{t,p} = B_t^p \sigma(a_t, b_t, p)$  A cancellation of a pending buy limit order
                                          with a limit price $p$ causing a decrease of
                                          $B_t^p$ by one.


Here, $\theta$, $\vartheta$, $\kappa$, $\rho$, $\lambda$, $\sigma$ are some
functions. It is assumed that all the flows of the market orders, the
flows of limit orders and their cancellations are mutually independent in
the sense that the conditional distribution of a relative jump time at any
fixed time $t$ given the history of $\Xi$ up to $t$ is exponential with
parameter

$$\Lambda_t = \theta_t + \vartheta_t + \sum_{p=b_t+1}^{n} \kappa_{t,p} + \sum_{p=0}^{a_t-1} \lambda_{t,p} + \sum_{p=a_t}^{n} \rho_{t,p} + \sum_{p=0}^{b_t} \sigma_{t,p}$$

(i.e., the sum of the rates listed in the table) and the probability that
the next event will be a particular one equals $Q_t/\Lambda_t$ where $Q_t$
is the event's intensity. It is obvious that $\Xi$ is then a Markov chain
in a countable state space.
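
The event mechanism just described (an exponential waiting time with the
total intensity, the next event drawn with probability proportional to its
rate) can be sketched as a single simulation step. This is only a minimal
sketch under simplifying assumptions that are not part of the paper: the
rates are constants, only the sell side is simulated, the bid is frozen,
and the names `zi_step`, `theta`, `kappa`, `rho` are illustrative.

```python
import random

def zi_step(A, b, theta, kappa, rho, rng):
    """One jump of the sell-side book A (dict: price -> number of orders).

    Hypothetical setting for illustration: prices are 1..n, the bid b is
    fixed, theta is the buy market order rate, kappa the sell limit order
    rate per tick and rho the cancellation rate per waiting order.
    Returns the waiting time until the jump; A is modified in place.
    """
    n = len(A)
    events = [("market", None, theta)]
    for p in range(b + 1, n + 1):
        events.append(("limit", p, kappa))            # arrival at tick p
        if A[p] > 0:
            events.append(("cancel", p, A[p] * rho))  # A^p * rho
    total = sum(rate for _, _, rate in events)        # total intensity
    wait = rng.expovariate(total)                     # exponential waiting time
    # choose the event with probability rate / total
    u, acc = rng.random() * total, 0.0
    for kind, p, rate in events:
        acc += rate
        if u < acc:
            break
    if kind == "limit":
        A[p] += 1
    elif kind == "cancel":
        A[p] -= 1
    else:
        # a buy market order removes one order at the best ask, if any
        asks = [q for q in range(1, n + 1) if A[q] > 0]
        if asks:
            A[min(asks)] -= 1
    return wait
```

Calling `zi_step` repeatedly and accumulating the returned waiting times
yields a trajectory of the sell order book under these assumptions.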

The following table shows how some of the models, mentioned in the 
Introduction, comply with our setting. 



model 


e 




K 


p 


A 


a 


Luckock (2003) 


K{ht) 


1 - L{at - 1) 







Ap 





Smith et al. (2003) 


9" 


9' 




p' 




P' 


Cont et al. (20101 




9^ 


i^^iP - h) 




K^'iat -p) 


P^iat - p) 



Here, = K{p)—K{p—1), = L{p) — L{p—1) where K, L are (continuous) 
cumulative distribution functions and 



are some functions and the rest 



of the symbols are constants. When speaking about Luckock (2003), we
have its discretized version (see Šmíd (2012), Sec. 3.3) in mind. When
speaking about Smith et al. (2003), we are considering its bounded version
(i.e., contrary to Smith et al. (2003), we assume zero arrival intensities
for prices less than one and greater than $n$; since the bounds of our
model may be set arbitrarily large, we can approximate Smith et al. (2003)
by our ZI model with an arbitrarily high accuracy).

Some of the models from the Introduction were not mentioned in the table:
We did not include Maslov (2000) due to potential technical difficulties
with its discretization and because, anyway, its discretized version would
be very similar to Smith et al. (2003) with one of its parameters set to
zero. The model by Mike and Farmer (2008), on the other hand, was not
included because of its complicated cancellation model and because, apart
from the cancellations, it is very similar to that of Cont et al. (2010).
Finally, we did not include Stigler (1964) because it is a special version
of Luckock (2003) (with $K(x) = L(x) = x$).



Proposition 1. If $\rho(\bullet) > 0$, $\sigma(\bullet) > 0$,
$\theta(\bullet) > 0$ and $\vartheta(\bullet) > 0$, then $\Xi_t$ is
ergodic.



Proof. Our proof mimics that of Cont et al. (2010), Proposition 2, which
verifies the ergodicity of their model by finding a Markov chain in
$\mathbb{N}$ dominating the total number of orders and having a recurrent
zero state; recurrence of that state then proves recurrence of the zero
state of the dominated model, which in turn verifies the ergodicity (see
Cont et al. (2010)). Here we put

$$L = \sum_{p=1}^{n} \left[ \max_{a,b} \kappa(a, b, p) + \max_{a,b} \lambda(a, b, p) \right]$$

and

$$M_i = \min_{a,b} \theta(a, b) + \min_{a,b} \vartheta(a, b) + i \min_{a,b,p} [\rho(a, b, p) \wedge \sigma(a, b, p)]$$

and argue that a Markov chain $S_t$ may be constructed such that

$$\sum_{p=1}^{n} [A_t^p + B_t^p] \le S_t$$

and having intensity matrix $P = (p_{ij})_{i,j \in \mathbb{N}}$ with all
non-diagonal components zero except of $p_{i,i+1} = L$, $i \ge 0$, and
$p_{i,i-1} = M_i$, $i > 0$. □
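
The recurrence of the dominating chain can be illustrated numerically. The
sketch below computes the stationary distribution of a birth-death chain
with a constant birth rate $L$ and a death rate $M_i = m_0 + i m_1$ via
detailed balance; the function name and the truncation rule are
illustrative assumptions, with $m_0$ standing for
$\min\theta + \min\vartheta$ and $m_1$ for $\min[\rho \wedge \sigma]$.

```python
def dominating_chain_stationary(L, m0, m1, tol=1e-12, max_states=10_000):
    """Stationary distribution of a birth-death chain on 0, 1, 2, ...

    Birth rate is L in every state, death rate is M_i = m0 + i * m1 in
    state i > 0 (m1 > 0 is assumed). The state space is truncated once the
    unnormalized weight drops below tol.
    """
    weights = [1.0]
    i = 0
    while weights[-1] > tol and i < max_states:
        i += 1
        # detailed balance: pi_{i-1} * L = pi_i * M_i
        weights.append(weights[-1] * L / (m0 + i * m1))
    total = sum(weights)
    return [w / total for w in weights]
```

Because the death rate grows linearly in $i$, the weights decay faster
than geometrically, so the normalizing sum is finite for any $L$; this is
the property the domination argument relies on.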

Thanks to the symmetry between the sell and buy order books in our model,
it suffices to deal only with the sell order books until the end of the
paper.

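Denote $t_1, t_2, \dots$ the jumps of $(a, b)$ (i.e., of the
two-dimensional process of the quotes). Further, denote

$$U_i = \begin{cases} 1 & \text{if } \Delta a_{t_i} > 0, \\ -1 & \text{if } \Delta a_{t_i} < 0, \\ 0 & \text{otherwise,} \end{cases} \qquad i \in \mathbb{N},$$

and

$$E_i = \mathbf{1}[\Delta a_{t_i} > 1], \qquad i \in \mathbb{N},$$

an indicator of more than unit jumps.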
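
The two indicators can be read off a recorded ask series directly. A
minimal sketch, assuming the ask is given in ticks right after each quote
jump; the function name `jump_indicators` and the input layout are
illustrative assumptions.

```python
def jump_indicators(asks):
    """Compute U_i and E_i from the ask observed after each quote jump.

    asks[0] is the initial ask, asks[i] the ask right after the i-th jump
    of the quotes (all in ticks). Returns the lists U and E for
    i = 1, 2, ...: U_i is the sign of the ask change, E_i = 1 iff the ask
    rose by more than one tick.
    """
    U, E = [], []
    for prev, cur in zip(asks, asks[1:]):
        d = cur - prev
        U.append((d > 0) - (d < 0))
        E.append(1 if d > 1 else 0)
    return U, E
```

A jump of the bid alone leaves the ask unchanged and therefore yields
$U_i = 0$ and $E_i = 0$.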

Note that, due to the unit size of market orders, the ask may jump more
than one tick only if no limit orders with the price one tick above the
ask are present in the order book at the time of the jump, i.e.,

$$E_i = 1 \iff M_i = 0,\ U_i = 1, \qquad M_i = A^{p_i}_{t_i-}, \quad p_i = a_{t_{i-1}} + 1.$$

Before proceeding to the distribution of $M_i$, note that, for each
$i \in \mathbb{N}$,

$$M_i = M_i^0 + M_i^+$$

where $M_i^0$ is the number of the orders, present in the book already at
the time $t_{i-1}$, which were not cancelled until $t_i$ and $M_i^+$ is the
number of the orders newly arrived and not cancelled between $t_{i-1}$ and
$t_i$.
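Proposition 2.

$$M_i^0 \mid \Xi_{t_{i-1}}, t_i, U_i, U_{i-1} \sim \mathrm{Bi}\left(m_i, \delta(a_{t_{i-1}}, b_{t_{i-1}}, \Delta t_i)\right), \qquad m_i = A^{p_i}_{t_{i-1}},$$

$$\delta(a, b, t; \kappa, \rho) = \exp\{-\rho(a+1, a, b)\, t\},$$

and, if $\rho(\bullet) > 0$, then

$$M_i^+ \mid \Xi_{t_{i-1}}, t_i, U_i, U_{i-1} \sim \mathrm{Po}\left(\gamma(a_{t_{i-1}}, b_{t_{i-1}}, \Delta t_i)\right),$$

$$\gamma(a, b, t; \kappa, \rho) = \frac{\kappa(a+1, a, b)\left(1 - \delta(a, b, t; \kappa, \rho)\right)}{\rho(a+1, a, b)}.$$

Moreover, $M_i^0$ and $M_i^+$ are conditionally independent given
$\Xi_{t_{i-1}}, t_i, U_i, U_{i-1}$.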

Proof. Note first that, from the Markov property, both $M_i^0$ and $M_i^+$
are conditionally independent of $U_{i-1}$ given $\Xi_{t_{i-1}}$; hence it
need not be considered when speaking about the distribution of the $M$'s
given $\Xi_{t_{i-1}}$, and that, thanks to the strong Markov property
(Theorem 12.14 of Kallenberg (2002)), we can, without loss of generality,
assume $\Xi_{t_{i-1}}$ to be deterministic. Further, note that we would not
change the distribution of $\Xi$ up to $t_i$ if we assume that

$$\Xi_{t_{i-1}+s} = \Xi'_s, \quad 0 \le s < \Delta t_i, \qquad \text{where } \Xi' = (A', B')$$

and $A'^p_s$, $B'^p_s$, $s \ge 0$, $1 \le p \le n$, are independent
immigration and death processes with the corresponding intensities
determined by $\Xi_{t_{i-1}}$, independent also of the first market order
arrival time and type. Now, as there are two possible causes of a jump of
$a$ up - a market order arrival and a cancellation of the last order with
the ask price - which both are caused by either the order flow at tick
$a_{t_{i-1}}$ or the market order process, we are getting that
$\Delta t_i$ is independent of $A'^{p_i}$, so the distribution of $M_i^0$
and $M_i^+$ may be computed as if the market order flow and the order flow
at the ask, and consequently $\Delta t_i$ and $U_i$, were deterministic
(see Šmíd (2012), Lemma A.1 and the discussion below). Our problem hence
reduces to finding the distribution of an immigration and death process
with an initial value $m$, the immigration rate $\kappa$ and death rate
$\rho$ at a fixed time $t$, which is known to be a convolution of a
Binomial distribution with parameters $m$ and $\exp\{-\rho t\}$ and a
Poisson distribution with intensity
$\frac{\kappa}{\rho}(1 - \exp\{-\rho t\})$ (see also Šmíd (2012), p. 80,
for the derivation of the formula and Proposition 4.2 therein for a more
exact proof concerning a more general model). □



As a direct consequence of the Proposition, we are getting:

Corollary 1. Under the assumptions of Proposition [2],

$$P(E_i = 1 \mid \Xi_{t_{i-1}}, t_i, U_i, U_{i-1}) = \omega(a_{t_{i-1}}, b_{t_{i-1}}, \Delta t_i, m_i),$$

$$\omega(a, b, t, m; \kappa, \rho) = \exp\{-\gamma(a, b, t; \kappa, \rho)\}\, [1 - \delta(a, b, t; \kappa, \rho)]^{m}$$

whenever $U_i = 1$.
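
The probability of an empty tick can be checked by simulation. The
following is a minimal sketch, assuming constant rates $\kappa$ and $\rho$
at the tick above the ask; the function names `immigration_death` and
`omega` are illustrative and not part of the paper. It simulates the
immigration and death process from the proof of Proposition 2 and provides
the closed form $\exp\{-\gamma\}[1 - \delta]^m$ for comparison.

```python
import math
import random

def immigration_death(m, kappa, rho, t, rng):
    """Population at time t of an immigration-death process started at m.

    kappa > 0 is the immigration rate, rho the death rate per individual.
    """
    x, s = m, 0.0
    while True:
        rate = kappa + x * rho
        s += rng.expovariate(rate)
        if s > t:
            return x
        if rng.random() * rate < kappa:
            x += 1   # a new order arrives
        else:
            x -= 1   # one of the x waiting orders is cancelled

def omega(m, kappa, rho, t):
    """Closed-form probability that the population is zero at time t."""
    delta = math.exp(-rho * t)            # survival probability of an order
    gamma = kappa / rho * (1.0 - delta)   # Poisson mean of surviving arrivals
    return math.exp(-gamma) * (1.0 - delta) ** m
```

Comparing the frequency of zeros among many simulated values with `omega`
gives a quick sanity check of the Binomial-Poisson convolution.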

The last result, in fact, opens a door for estimation of the parameters of
the order books by means of L1 data because it enables us to indirectly
observe $M_i$, whose distribution depends on the parameters of interest;
these may consequently be estimated by means of the sample of such $E_i$
for which $m_i$ is observable, which is exactly the case when
$U_{i-1} = -1$.

Remark 1. If $\kappa(p, a, b) = \kappa(p - b)$ and
$\rho(p, a, b) = \rho(p - b)$ for all $a, b, p$ where $\kappa$ and $\rho$
are completely unknown (as in Cont et al. (2010), for instance) then,
theoretically, all $\kappa(2), \dots, \kappa(n-1)$ and
$\rho(2), \dots, \rho(n-1)$ could be estimated^3 (note that $\Xi$ is
ergodic so the number of occurrences of each state goes to infinity). In
practice, however, there will not be enough observations for estimation of
the $\kappa$'s and $\rho$'s with higher arguments because the probability
of states with a large spread (which would be needed to estimate the
intensities for higher values of the arguments) is very low.

^3 Note that $\kappa(1)$ and $\rho(1)$ can be estimated directly from the
L1 data.

3 Data

We use L1 data, i.e. the history of the process

$$(a_t, A_t^{a_t}, b_t, B_t^{b_t}), \quad t \ge 0,$$

for three US stocks

• Microsoft (MSFT),

• General Electric (GE),

• Exxon Mobil (XOM)

at three US electronic markets

• NASDAQ,

• ARCA (NYSE),

• ISE

for our estimation. The data come from the one year period starting 12/2008
and ending 11/2009. Figure [1] shows basic characteristics of the data.

The reasons for which we do not use the L2 (order book) data are three.
The first one is the already mentioned difficult accessibility of the
limit order data. The second one is that, although the estimation by means
of the quotes is indirect and hence brings less information than the L2
data would, this is compensated by the much larger amount of available L1
data. The third reason is the possible existence of hidden limit orders,
which might spoil the information provided by the order data; as is clear
from our treatment, this problem is avoided by using the quotes.

A clear disadvantage of the estimation via L1 data is that only the parts
of the order books close to the quotes may be estimated this way; however,
in very liquid markets, this fact does not much harm the estimation of the
distribution of price movements and/or price impact (which are often the
primary topics of interest), because the influence of the deeper parts of
the order books in liquid markets is weak.



4 Empirical Evidence 

It follows from the ergodicity of $\Xi$ and Lemma [1] (see Appendix) that
the (conditional) probabilities

$$\omega^0(t) := P(M_i = 0 \mid t_i = t, m_i = 0, U_i = 1, U_{i-1} = -1)$$

and

$$\omega^+(t) := P(M_i = 0 \mid t_i = t, m_i > 0, U_i = 1, U_{i-1} = -1),$$

viewed as functions of $t$, may be approximated by step functions defined
by corresponding empirical frequencies; more precisely, thanks to the
ergodicity,

$$\omega^0(t) \approx e^0_{\delta,N}(t), \quad t > 0,$$

for $\delta > 0$ small enough and $N$ large enough, where

$$e^0_{\delta,N}(t) = \sum_{k=1}^{\infty} \mathbf{1}[t \in [(k-1)\delta, k\delta)] \, \frac{\sum_{i=1}^{N} E_i J^i_{(k-1)\delta,\delta}}{\sum_{i=1}^{N} J^i_{(k-1)\delta,\delta}},$$
             12/2008  1/2009  2/2009  3/2009  4/2009  5/2009  6/2009  7/2009  8/2009  9/2009 10/2009 11/2009

MSFT at Nasdaq
$\Delta t$      2.39    3.47    4.40    4.64    4.08    5.14    5.65    6.32    7.29    7.72    5.35    7.40
#               0.98    0.67    0.53    0.50    0.57    0.46    0.41    0.37    0.32    0.30    0.44    0.32

GE at Nasdaq
s               1.51    1.51    1.48    1.48    1.48    1.49    1.43    1.51    1.51    1.50    1.50    1.52
$\Delta t$      2.75    4.12    4.99    4.34    6.84    8.13   13.55   14.24   15.57    8.96    9.21   14.07
#               0.85    0.57    0.47    0.64    0.34    0.29    0.17    0.16    0.16    0.26    0.26    0.16

XOM at Nasdaq
s               2.48    2.18    1.85    1.91    1.93    1.92    2.00    1.90    2.00    1.94    1.71    1.84
$\Delta t$      0.36    0.42    0.56    0.63    1.00    1.26    1.30    1.40    1.78    1.99    1.61    1.36
#               6.55    5.54    4.20    3.69    2.36    1.86    1.80    1.67    1.31    1.17    1.66    1.70

MSFT at ARCA
s               1.73    1.52    1.64    1.64    1.56    1.62    1.57    1.57    1.58    1.54    1.48    1.66
$\Delta t$      3.00    4.72    6.44    5.99    6.06    7.23    7.72    8.60   10.66   11.53    8.44   10.39
#               0.78    0.50    0.43    0.39    0.39    0.32    0.30    0.27    0.22    0.20    0.28    0.22

GE at ARCA
s               1.57    1.53    1.52    1.51    1.54    1.56    1.54    1.52    1.52    1.73    1.50    1.50
$\Delta t$      2.83    4.50    4.89    4.19    6.68    8.01   12.57   14.38   15.26   10.16   11.32   16.38
#               0.82    0.52    0.48    0.66    0.35    0.29    0.19    0.16    0.15    0.23    0.21    0.14

XOM at ARCA
s               2.37    2.18    1.82    1.82    1.84    1.95    1.88    1.88    1.90    1.89    1.73    1.90
$\Delta t$      0.34    0.43    0.46    0.63    0.96    1.13    1.15    1.23    1.50    1.76    1.61    1.27
#               6.78    5.47    6.09    4.40    2.43    2.08    2.03    1.90    1.56    1.33    1.66    1.81

MSFT at ISE
$\Delta t$      1.83    3.05    3.06    3.08    3.57    4.40    5.68    6.13    7.22    7.53    5.12    7.16
#               1.25    0.77    0.76    0.76    0.65    0.53    0.41    0.38    0.32    0.31    0.46    0.32

GE at ISE
s               1.90    1.83    1.73    1.68    1.67    1.69    1.70    1.76    1.71    1.56    1.57    1.55
$\Delta t$      1.83    3.19    3.49    2.50    4.22    6.08    9.63   12.30   12.51    7.89    8.66   13.08
#               1.25    0.73    0.67    0.93    0.66    0.46    0.24    0.19    0.19    0.30    0.27    0.17

XOM at ISE
s              19.50   10.34    8.55    7.30    4.75    4.59    4.09    5.13    4.20    3.22    3.33    4.40
$\Delta t$      0.56    0.55    0.44    0.47    0.61    0.66    0.71    0.76    0.81    0.96    0.77    0.76
#               4.12    4.24    5.35    5.03    3.85    3.57    3.29    3.07    2.90    2.46    3.05    3.01

Figure 1: Summary data for the examined stocks and markets. The table:
m/y - month/year, s - average spread, $\Delta t$ - average time between
jumps of the quotes, # - number of jumps of the quotes a day (in 10000).
The graph: vertical lines - number of jumps, the curve - average spread.



Figure 2: XOM at ISE - empirical probabilities. Panels: (a) $e^0$ against
$t$; (b) $e^+$ against $t$.

$$J^i_{t,\delta} = \mathbf{1}[t_i \in [t, t + \delta),\ m_i = 0,\ U_i = 1,\ U_{i-1} = -1].$$

The function $\omega^+$ may be approximated analogously; denote $e^+$ its
approximation. Figure [2] shows such approximations in the case of XOM at
ISE.
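
The step-function approximation amounts to binning the observations by the
conditioning time and taking the relative frequency of $E_i = 1$ in each
bin. A minimal sketch under assumptions: the function name
`empirical_step` and the argument layout are hypothetical, and the flag
`keep` stands for the remaining conditions (e.g. $m_i = 0$, $U_i = 1$,
$U_{i-1} = -1$), under which $E_i = 1$ is equivalent to $M_i = 0$.

```python
def empirical_step(times, Es, keep, delta):
    """Relative frequency of E_i = 1 on the bins [(k-1)*delta, k*delta).

    times[i] - value of the conditioning time variable of observation i,
    Es[i]    - the indicator E_i (0 or 1),
    keep[i]  - True if observation i satisfies the remaining conditions.
    Returns a dict mapping the 1-based bin index k to the frequency;
    bins without any kept observation are omitted.
    """
    hits, counts = {}, {}
    for t, e, ok in zip(times, Es, keep):
        if not ok:
            continue
        k = int(t // delta) + 1
        counts[k] = counts.get(k, 0) + 1
        hits[k] = hits.get(k, 0) + e
    return {k: hits[k] / counts[k] for k in counts}
```

The value of the step function at a given $t$ is then the frequency stored
under the index of the bin containing $t$.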

Now, since $\lim_{t \to 0} \omega(a, b, t, 0) = 1$ for each $a, b$ (for
$m = 0$, Corollary 1 gives $\omega = \exp\{-\gamma\}$ and $\gamma \to 0$
as $t \to 0$) and since
$\omega^0(t) = \mathbb{E}_{a_0,b_0}(\omega(a_0, b_0, t, 0))$, it should be
$\lim_{t \to 0} \omega^0(t) = 1$ and consequently
$\lim_{t \to 0} e^0(t) = 1$. However, the converse is true as indicated by
Figure [2] (a), where the empirical version of $\omega^0$ clearly tends to
a non-unit value at zero.

Similarly, looking at the approximation of $\omega^+$ (Figure [2] (b)), we
find that even if it should approach zero at zero (note that
$\lim_{t \to 0} \omega(a, b, t, m) = 0$ for any $a, b$ and $m > 0$ because
$\delta \to 1$), it tends to a clearly non-zero value.

Finally, as, for any $a, b, m$,
$\lim_{t \to \infty} \omega(a, b, t, m) > l$ for some $l > 0$ independent
of $a, b, m$ given a non-zero cancellation rate, the limits at $\infty$ on
both (a) and (b) of Figure [2] should be non-zero, which does not seem to
be the case. One may object here that zero limits could be caused by a
zero cancellation rate; this would, however, imply
$\omega(a, b, t, m) = 0$ for any $m > 0$ (the reason is that the initial
orders would never be cancelled so the jump of $a$ could never be
non-unit), which is clearly not the case, either.

In summary, it is clear that the ZI model introduced in Section [2] is too
poorly parametrized to explain the empirical data in a satisfactory way,
because, in reality,

(L0) the limit of $P(M_i = 0 \mid m_i = 0, t_i = t)$ as $t \to 0$ is not
one,

(L+) the limit of $P(M_i = 0 \mid m_i > 0, t_i = t)$ as $t \to 0$ is not
zero,

(R) the limit of $P(M_i = 0 \mid t_i = t)$ as $t \to \infty$ is zero,

each of which points contradicts the model.

Clearly, our graphs do not falsify the ZI model in a statistically correct
way (nothing was said about the rate of convergence of the estimates of
$\omega$). However, these preliminary findings are statistically confirmed
in the next Section where a super-model of the falsified model is shown to
be significant in a majority of cases.

Before proceeding further, however, one more discrepancy between the model
and reality, not discussed yet, has to be recalled:

(U) the volumes of the orders are not unit in reality.

5 Generalized Zero Intelligence Model 

In the present Section, we introduce a generalization of the ZI model
introduced in Section [2] so that it complies with the empirical facts
(L0), (L+), (R) and (U).

Starting with (L0), note that this in fact says that unexpectedly many
limit orders appear during short time periods after jumps of the ask (note
that the L1 data cannot contain values of $m_i$ if the ask jumps down more
than one tick at $t_{i-1}$, so we, in accordance with the definition of the
model, only assume that $m_i = 0$). We suggest two explanations of the
emergence of those unexpected orders:

1. If $a$ jumps down more than one tick at $t_{i-1}$ then not only a new
limit order (with limit price equal to the new ask) is put into the spread
but some additional one(s) are possibly placed into the ticks between
the old and the new value of the ask (perhaps by algorithmic trading)
- we assume the numbers of these orders to be Poisson with parameter
$\gamma_0$.

2. At times of events causing $a$ to jump up (i.e., market order arrivals
and cancellations of the ask) one or more limit orders are (perhaps by
algorithmic trading or as a result of a shift of the quote) put one tick
above the original ask at the time of the jump - we assume the number
of those orders to be Poisson with parameter $\eta_0$.
The fact (L+), on the other hand, indicates that some limit orders expected
are missing, which may suggest that a jump of $a$ down at $t_{i-1}$ might
be caused not only by a placement of a limit order into the spread but also
by a shift of the ask down. Therefore, we assume:

3. At the time of a jump of $a$ down (i.e. a placement of a new order
into the spread) each of the orders formerly being the ask might be
cancelled with probability $\alpha_0$.

Coming to (R), the most natural explanation of this fact seems to be the
existence of strategic long-term orders, so we suppose:

4. Orders with a negligible cancellation rate $\varrho_0$ arrive with a
rate $\lambda_0 \gg \varrho_0$ into each tick greater than the ask.



Finally, taking (U) into account, we speculate, similarly to e.g. Smith et
al. (2003), that the volume of all the orders is $\mu_0 > 0$ instead of
one, i.e. that the "actual" market follows a process

$$(\tilde{A}, \tilde{B}) = (\mu_0 A, \mu_0 B). \qquad (1)$$

As to the arrival and cancellation rates of the generalized model, we
assume, similarly to e.g. Mike and Farmer (2008), that
$\kappa(a, b, p) = \kappa(p - a)$ and $\rho(a, b, p) = \rho(p - a)$ for
each $b$ and $p > a$.

Now, let us define our generalized model so that it takes 1.-4. and ([1])
into account. Denoting $A$ and $\bar{A}$ the numbers of short-term and
long-term orders, respectively, and, quite naturally, assuming that any
long-term order turns into a short-term one if it becomes an ask, we get
the list of events in our generalized zero intelligence (GZI) model as
follows:
event rate                                description

$\theta_t = \theta(a_t, b_t)$             An arrival of a buy market order, causing
                                          $A_t^{a_t}$ to decrease by one. If its
                                          resulting value is zero then $A_t^{a_t+1}$
                                          is increased by a random number having
                                          $\mathrm{Po}(\eta_0)$ for some $\eta_0 > 0$
                                          (the increment is independent of the past of
                                          the whole process).

$\kappa_{t,p} = \kappa(a_t, b_t, p)$      An arrival of a short-term sell limit order
                                          with limit price $p > b_t$ causing an
                                          increase of $A_t^p$ by one. If $p < a_t$ then
                                          each order with the price equal to the
                                          original ask is cancelled with probability
                                          $\alpha_0$; if the jump of $a$ is more than
                                          one tick then each $A_t^{\pi}$,
                                          $a_t < \pi < a_{t-}$, is increased by a
                                          random number $\sim \mathrm{Po}(\gamma_0)$
                                          for some $\gamma_0 > 0$ (the random
                                          increments and cancellations are mutually
                                          independent and independent of the past of
                                          the whole process).

$\rho_{t,p} = A_t^p \rho(a_t, b_t, p)$    A cancellation of a pending short-term sell
                                          limit order with a limit price $p$ causing a
                                          decrease of $A_t^p$ by one. If its resulting
                                          value is zero then $A_t^{a_t+1}$ is increased
                                          by a random number having
                                          $\mathrm{Po}(\eta_0)$.

$\lambda_0$                               An arrival of a long-term limit order with
                                          price $p > a_t$ increasing $\bar{A}_t^p$ by
                                          one.

$\bar{A}_t^p \varrho_0$                   A cancellation of a long-term order with
                                          price $p$.

The definitions concerning sell market orders and buy limit orders are
symmetric.



If we assume the "independence" analogously to our ZI model, we get:

Proposition 3.

$$\Xi = (A, \bar{A}, B, \bar{B})$$

is a Markov chain with a countable state space.

Moreover,

Proposition 4. If $\rho(\bullet) > 0$ then $\Xi$ is ergodic.

Proof. As the increase rate of the total number of the orders is bounded
from above (by $2n(\lambda_0 + \eta_0 + \gamma_0 + \max(\kappa(\bullet)))$)
while the decrease rate is bounded from below, the total number of limit
orders is, similarly to the proof of Proposition [1], bounded from above by
an ergodic Markov chain, which proves the Proposition. □

A brief analysis shows that, now,

$$M_i = M_i^0 + M_i^+ + \bar{M}_i^+ + M_i^{0+} + M_i^{1+}$$

where $M_i^0$ is the number of uncancelled orders out of those present at
$t_{i-1}$, $M_i^+$ and $\bar{M}_i^+$ are the numbers of uncancelled
short-term orders and long-term orders, respectively, out of those having
arrived during $(t_{i-1}, t_i)$, $M_i^{0+}$ is the number of (cancellable)
orders having appeared at $t_{i-1}$ and $M_i^{1+}$ is the number of the
orders having arrived during the jump of $a$ up.
Denote

$$\Omega_i = (\Xi_{t_{i-1}}, t_i, U_i).$$

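Similarly to the ZI model, we have

$$M_i^+ \mid \Omega_i \sim \mathrm{Po}\left(\frac{\kappa_0}{\rho_0}\left(1 - \exp\{-\rho_0 \Delta t_i\}\right)\right),$$

where $\kappa_0 = \kappa(1)$ and $\rho_0 = \rho(1)$; clearly

$$\bar{M}_i^+ \mid \Omega_i \sim \mathrm{Po}\left(\frac{\lambda_0}{\varrho_0}\left(1 - \exp\{-\varrho_0 \Delta t_i\}\right)\right),$$

which we, however, approximate by taking a limit as $\varrho_0 \to 0$ and
getting

$$\bar{M}_i^+ \mid \Omega_i \sim \mathrm{Po}(\lambda_0 \Delta t_i)$$

(we did this approximation also because a negligible parameter could hardly
be estimated).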
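
The limit behind this approximation follows from a first-order expansion
of the exponential. A short sketch of the step, writing $\lambda_0$ for the
arrival rate and $\varrho_0$ for the cancellation rate of the long-term
orders:

```latex
\frac{\lambda_0}{\varrho_0}\left(1 - e^{-\varrho_0 \Delta t_i}\right)
  = \frac{\lambda_0}{\varrho_0}\left(\varrho_0 \Delta t_i
      - \tfrac{1}{2}\varrho_0^2 \Delta t_i^2 + O(\varrho_0^3)\right)
  = \lambda_0 \Delta t_i - \tfrac{1}{2}\lambda_0 \varrho_0 \Delta t_i^2
      + O(\varrho_0^2)
  \;\longrightarrow\; \lambda_0 \Delta t_i
  \quad \text{as } \varrho_0 \to 0 .
```

The neglected term is of order $\lambda_0 \varrho_0 \Delta t_i^2$, so the
approximation is tight as long as $\varrho_0 \Delta t_i$ is small.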

Further, from the definition,

$$M_i^{1+} \mid \Omega_i \sim \mathrm{Po}(\eta_0),$$

and

$$M_i^{0+} \mid \Omega_i \sim \mathrm{Po}\left(I(m_i)\, \gamma_0 \exp\{-\rho_0 \Delta t_i\}\right), \qquad I(m) = \mathbf{1}[m = 0]$$

(we have used the fact that if we perform a Bernoulli trial on each "unit"
of a Poisson variable, the sum of successes is Poisson with the original
intensity multiplied by the Bernoulli probability).
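
The thinning property used here can be checked by a small simulation. A
minimal sketch, with hypothetical helper names `poisson` and `thinned`;
the Poisson sampler is Knuth's multiplication method, chosen only because
it needs nothing beyond the standard library.

```python
import math
import random

def poisson(lam, rng):
    """Sample from Po(lam) by Knuth's multiplication method (small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def thinned(lam, p, rng):
    """Draw N ~ Po(lam) and keep each unit independently with probability p."""
    return sum(1 for _ in range(poisson(lam, rng)) if rng.random() < p)
```

If the property holds, the values returned by `thinned(lam, p, rng)` behave
like draws from a Poisson distribution with intensity `lam * p`; in
particular, the frequency of zeros should be close to `exp(-lam * p)`.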

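Finally, denoting $\tilde{m}_i$ the actual volume one tick above the ask,
i.e.

$$\tilde{m}_i = \mu_0 m_i = \tilde{A}^{p_i}_{t_{i-1}},$$

we have

$$M_i^0 \mid \Omega_i \sim \mathrm{Bi}\left([\mu_0^{-1} \tilde{m}_i],\ (1 - \alpha_0) \exp\{-\rho_0 \Delta t_i\}\right).$$

For computational simplicity and a subsequent reduction of the parameter
space, we however use the well known approximation
$\mathrm{Bi}(n, p) \xrightarrow{n \to \infty} \mathrm{Po}(\lim np)$, to get

$$M_i^0 \mid \Omega_i \sim \mathrm{Po}\left(\beta_0 \exp\{-\rho_0 \Delta t_i\}\, \tilde{m}_i\right), \qquad \beta_0 = \mu_0^{-1}(1 - \alpha_0).$$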
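
Since only the probability of an empty tick enters the estimation, the
quality of the Binomial-to-Poisson approximation can be judged by comparing
the two zero probabilities, $(1 - p)^n$ and $\exp\{-np\}$. A minimal
sketch; the function names are illustrative.

```python
import math

def zero_prob_binomial(n, p):
    """P(X = 0) for X ~ Bi(n, p)."""
    return (1.0 - p) ** n

def zero_prob_poisson(n, p):
    """P(X = 0) for the approximating X ~ Po(n * p)."""
    return math.exp(-n * p)
```

The two values nearly coincide for many orders with a small survival
probability, while they differ visibly for a few orders with a large
survival probability (for instance $n = 2$, $p = 0.5$ gives $0.25$ against
roughly $0.37$), which is where the approximation mentioned in the text is
coarsest.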

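From our assumptions of independence we now get that

$$P(E_i = 1 \mid \Omega_i) = \varpi(\tilde{m}_i, \Delta t_i; \kappa_0, \gamma_0, \lambda_0, \beta_0, \eta_0, \rho_0), \qquad (2)$$

$$\varpi(m, t; \kappa, \gamma, \lambda, \beta, \eta, \rho) = \exp\left\{ -\frac{\kappa}{\rho}\left[1 - e^{-\rho t}\right] - \gamma I(m) e^{-\rho t} - \lambda t - m \beta e^{-\rho t} - \eta \right\},$$

where $\Theta_0 = (\kappa_0, \gamma_0, \lambda_0, \beta_0, \eta_0, \rho_0) > 0$
are parameters of our interest.
To estimate $\Theta_0$, we use a non-linear least squares estimator

$$\Theta_n = \arg\min_{\Theta \ge 0} S_n(\Theta), \qquad S_n(\Theta) = \sum_{i=1}^{n} \left[E_i - \varpi(\tilde{m}_i, \Delta t_i; \Theta)\right]^2,$$

where we put $E_i = \varpi(\tilde{m}_i, \Delta t_i) = 0$ by definition
whenever $U_i \ne 1$ or $U_{i-1} \ne -1$.^4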
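
The probability (2) and the least squares criterion translate directly into
code. The following is a minimal sketch rather than the estimation
procedure used in the paper: the names `varpi`, `nls_objective` and
`simulate_sample` are illustrative, the parameter vector is ordered as
$(\kappa, \gamma, \lambda, \beta, \eta, \rho)$, and the artificial design
in `simulate_sample` (volumes 0 to 3, exponential inter-jump times) is a
pure assumption made to have data to evaluate the criterion on.

```python
import math
import random

def varpi(m, t, theta):
    """Probability (2) of a more-than-unit jump; requires rho > 0."""
    kappa, gamma, lam, beta, eta, rho = theta
    decay = math.exp(-rho * t)
    return math.exp(-(kappa / rho * (1.0 - decay)
                      + gamma * (m == 0) * decay
                      + lam * t + m * beta * decay + eta))

def nls_objective(theta, sample):
    """S_n(theta) = sum over i of [E_i - varpi(m_i, dt_i; theta)]^2."""
    return sum((e - varpi(m, dt, theta)) ** 2 for e, m, dt in sample)

def simulate_sample(theta, n, rng):
    """Artificial data: m in {0,...,3}, dt ~ Exp(1), E ~ Bernoulli(varpi)."""
    sample = []
    for _ in range(n):
        m, dt = rng.randrange(4), rng.expovariate(1.0)
        e = 1 if rng.random() < varpi(m, dt, theta) else 0
        sample.append((e, m, dt))
    return sample
```

On data simulated with a given parameter vector, `nls_objective` should be
smaller at that vector than at a clearly different one; an actual estimate
would be obtained by passing `nls_objective` to a numerical minimizer with
non-negativity constraints.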

Proposition 5. If the random element $(E_i, \Omega_i)_{i \in \mathbb{N}}$
is such that the distribution of $\Omega_i$ is as in the GZI model and the
conditional distribution of $(E_i)_{i \in \mathbb{N}}$ is given by ([2])
and if the parameter space is bounded from above, then $\Theta_n$ is
consistent and, if $\rho_0 > 0$, $\beta_0 > 0$, also asymptotically normal.
Moreover, the minimization of NLS is an asymptotically convex problem up to
a continuous transformation of parameters.

^4 Because $\kappa$ and $\rho$ do not depend on $b$ in the GZI model, we may
take $t_i$ as jumps not of the whole $(a, b)$ but of only $a$.

Proof. See Appendix for all the details. □



Remark 2. Even if, due to heteroskedasticity, the weighted version of the
NLS estimator suggests itself to be used here, we do not apply it because
of a large impact of possible outliers with small predicted residual
variance (note that the variance of $E_i$ tends to zero once $\varpi$ is
close to zero or to one).

Remark 3. Note that we do not deal with the market order arrival rate; the
reason for this is that it could be easily estimated directly from the L1
data. Further, as we already said, we leave aside the parameters
$\kappa(i), \rho(i)$, $i > 1$.^5

Remark 4. Before proceeding to the estimation, let us also note that the
asymptotic properties of our estimator would not be lost if we used the
exact conditional probability of $E_i$ instead of the approximation.



6 Empirical Evidence Revisited 

In Figure [3], we present the results of the estimation of the parameters
of the GZI model for all the combinations of the three stocks with the
three markets mentioned in Section [3]. In the case of each stock-market
pair, we used a sample of at most 500,000 observations of $E_i$ such that
$a$ jumped up at $t_i$ and $a$ jumped down at $t_{i-1}$. In addition to the
usual statistical analysis including point estimates, standard errors and
significance levels, we illustrate the fit of the predicted shape of
$\varpi$ by its empirical counterparts. In particular, we present four
graphs, the $k$-th one depicting a comparison of a predicted and an
empirical version of

$$p_k(\bullet) := P(E_i = 1 \mid \Delta t_i = \bullet,\ \tilde{m}_i \in J_k),$$

where $J_1 = \{0\}$ (the top left graph) and $J_2$, $J_3$ and $J_4$ are
chosen so that they contain roughly the same number of values of
$\tilde{m}_i$. Similarly, the time axis is distorted so that equal
intervals on the x-axis contain comparable numbers of observations of
$\Delta t_i$. The graphical comparison is an in-sample one.
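
Groups with roughly equal numbers of observations, as used both for the
sets $J_2$, $J_3$, $J_4$ and for the distorted time axis, can be obtained
by cutting the sorted values at equally spaced ranks. A minimal sketch; the
function name `equal_count_bins` is illustrative and the exact rule used
for the graphs is not specified in the text.

```python
def equal_count_bins(values, k):
    """Split the values into k groups of (nearly) equal size.

    Returns the list of (smallest, largest) values of the groups, in
    increasing order; empty groups (possible if k exceeds the number of
    values) are skipped.
    """
    xs = sorted(values)
    n = len(xs)
    edges = [round(j * n / k) for j in range(k + 1)]
    return [(xs[lo], xs[hi - 1]) for lo, hi in zip(edges, edges[1:]) if hi > lo]
```

With heavily tied data, such as integer volumes, neighbouring groups may
share a boundary value, so in practice ties would have to be assigned to
one side.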

Naturally, the model fits much better those stock-market pairs which
provide larger samples, i.e., those for which the ask jumps more often and by
larger magnitudes. If, on the other hand, the number of jumps is too small,
the method fails completely, as can be seen in the case of GE at NASDAQ.

^In principle, our methodology could be used to estimate, e.g., $\kappa(2)$ and $\rho(2)$ by
discerning whether the jump of $a$ is by one, two or more ticks; here, however, we omit
this for simplicity.
Parameter estimates (standard errors in parentheses, asterisks mark significance levels, "err" and "n/a" as reported):

Stock at market   κ                  γ                 λ                 β                  η                 ρ
MSFT at NASDAQ    8.41 (328.82)      0.62 (0.31)*      17.67 (196.69)    0.24 (0.04)***     1.97 (118.22)     1.78 (10.36)
GE at NASDAQ      2579.72 (err)      0.00 (err)        12.78 (err)       0.30 (err)         1.68 (err)        64.48 (err)
XOM at NASDAQ     0.00 (n/a)         0.62 (n/a)        13.07 (n/a)       0.21 (n/a)         1.45 (n/a)        0.00 (n/a)
MSFT at ARCA      0.83 (8.29)        0.00 (0.24)       3.31 (3.50)       0.10 (0.02)***     2.33 (3.46)       1.38 (1.02)
GE at ARCA        10.51 (5.60)*      0.00 (0.46)       3.48 (1.33)**     0.15 (0.04)***     2.19 (0.70)***    6.11 (2.06)**
XOM at ARCA       19.26 (0.47)***    0.08 (0.05)       0.64 (0.07)***    0.23 (0.01)***     1.71 (0.07)***    6.79 (0.43)***
MSFT at ISE       20.50 (2.86)***    2.15 (0.17)***    2.98 (0.68)***    0.15 (0.01)***     1.21 (0.29)***    7.99 (1.06)***
GE at ISE         0.09 (12.08)       1.19 (0.21)***    3.80 (4.19)       0.06 (0.01)***     2.68 (6.66)       0.84 (0.70)
XOM at ISE        0.79 (0.03)***     0.58 (0.01)***    0.17 (0.01)***    0.024 (0.00)***    0.44 (0.01)***    1.13 (0.11)***

Sample characteristics:

Stock at market   Period             ΣE / n             Δa      s
MSFT at NASDAQ    12/2008-11/2009    322 / 129084       1.00    2.02
GE at NASDAQ      12/2008-11/2009    89 / 82766         1.00    1.89
XOM at NASDAQ     12/2008-3/2009     9419 / 500000      1.03    2.39
MSFT at ARCA      12/2008-11/2009    305 / 101115       1.00    1.96
GE at ARCA        12/2008-11/2009    124 / 86922        1.00    1.96
XOM at ARCA       12/2008-3/2009     8977 / 500000      1.02    2.39
MSFT at ISE       12/2008-11/2009    1813 / 212607      1.06    2.17
GE at ISE         12/2008-11/2009    834 / 173372       1.02    2.05
XOM at ISE        12/2008-5/2009     232725 / 500000    4.40    10.23

Figure 3: Results of estimation. Table: ΣE - number of jumps of a up
greater than 1, n - sample size, Δa - average jump of a up, s - average
spread. Graphs: vertical dotted line - average value of E, curved dotted line
- predicted shape of $p_k$, points with error bars - its empirical version.



Roughly speaking, for the method to be successful, the stock has to be XOM
or the market has to be ISE (their combination fitting best), which leads
us to the conclusion that our method is more suitable for moderately liquid
stocks/markets than for the super-liquid ones.
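
One rough way to see this pattern in the reported numbers is to compare, for each stock-market pair, the share of observations with a jump up greater than one tick, i.e. ΣE divided by n as given in Figure 3. The snippet below only re-arranges those reported counts; the interpretation of the ratio as a measure of how informative a sample is remains an illustration, not a claim of the paper.

```python
# (sum of E, n) for each stock-market pair, as reported in Figure 3
samples = {
    ("MSFT", "NASDAQ"): (322, 129084),
    ("GE", "NASDAQ"): (89, 82766),
    ("XOM", "NASDAQ"): (9419, 500000),
    ("MSFT", "ARCA"): (305, 101115),
    ("GE", "ARCA"): (124, 86922),
    ("XOM", "ARCA"): (8977, 500000),
    ("MSFT", "ISE"): (1813, 212607),
    ("GE", "ISE"): (834, 173372),
    ("XOM", "ISE"): (232725, 500000),
}

# Share of observations in which the ask jumped up by more than one tick.
share = {pair: jumps / n for pair, (jumps, n) in samples.items()}

for pair, value in sorted(share.items(), key=lambda item: item[1]):
    print(f"{pair[0]:<5} at {pair[1]:<7} {100 * value:7.3f} %")
```

GE at NASDAQ has the smallest share (about 0.1 %) and XOM at ISE by far the largest (about 46.5 %).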

7 Conclusion 

We have formulated a partially tractable and estimable model of a limit
order market which agrees with the data better than any of the previously
published zero intelligence models, which is illustrated by means of L1 data of
three US stocks.
A Appendix



First, let us state a nearly obvious theoretical result which we use. 

Lemma 1. If $X$ is a continuous time stationary ergodic Markov chain in a
countable space then $Y = (\Delta\tau_i, X_{\tau_{i-1}})_{i \in \mathbb{N}}$, where $\tau_i$ is the $i$-th jump time, is
a stationary ergodic stochastic process.

Proof. Denote by $\Lambda(x)$ the intensity of the jump of $X$ given that the state of
$X$ is $x$. From the strong Markov property (Theorem 12.14 of Kallenberg (2002)),
from Lemma 12.16 of Kallenberg (2002) and from the scalability of the
exponential distribution we have that $U_1, U_2, \dots$, $U_i = \Delta\tau_i \Lambda(X_{\tau_{i-1}})$, $i \in \mathbb{N}$,
is a sequence of i.i.d. (unit exponentially) distributed variables, independent
of $X_0, X_{\tau_1}, \dots$. As $U_n$, being an i.i.d. sequence, is strongly mixing and
$X_{\tau_n}$ is strongly mixing by Bradley (2005), the process $Z_n := (X_{\tau_{n-1}}, U_n)$ is
strongly mixing: for any "rectangles" $A = A^1 \times A^2$ and $B = B^1 \times B^2$,

$$P(Z \in A \cap t_n B) = P(Z^1 \in A^1 \cap t_n B^1)\,P(Z^2 \in A^2 \cap t_n B^2) \to P(Z^1 \in A^1)P(Z^1 \in t_n B^1)P(Z^2 \in A^2)P(Z^2 \in t_n B^2) = P(Z \in A)P(Z \in B),$$

and the case of general $A$, $B$ follows from their approximation by rectangles.
We thus get the ergodicity of $Z$ by the well known fact that strong mixing
implies ergodicity. Finally, as $Y_n$ is a function of $Z_n$, the Lemma is proved. □
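
The scaling step used in the proof, namely that an exponential holding time multiplied by its intensity is unit exponential whatever the intensity, can be checked by a small simulation. The sketch below is only an illustration: the intensities are hypothetical, and the states are simply cycled through rather than generated by an actual Markov chain.

```python
import random


def scaled_holding_times(rates, n, seed=0):
    """Draw n holding times with state-dependent intensities and rescale each
    by its own intensity; the rescaled values should be unit exponential."""
    rng = random.Random(seed)
    scaled = []
    for i in range(n):
        intensity = rates[i % len(rates)]    # intensity in the current (toy) state
        holding = rng.expovariate(intensity)  # holding time ~ Exp(intensity)
        scaled.append(holding * intensity)    # rescaled holding time
    return scaled


u = scaled_holding_times([0.5, 2.0, 10.0], 30000)
mean_u = sum(u) / len(u)
second_moment_u = sum(x * x for x in u) / len(u)
print(round(mean_u, 2), round(second_moment_u, 2))
```

For a unit exponential variable the mean is 1 and the second moment is 2, which the simulated values should approach.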



A.1 The NLS estimator

Before proving the desired properties of our NLS estimator, we perform a re- 
parametrization of the problem simplifying the further theoretical analysis. 
In particular, we introduce two additional parameters: 



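$$\phi = \frac{\kappa}{\rho}, \qquad \zeta = \frac{\kappa}{\rho} + \eta.$$

The vector of the parameters is now

$$\Upsilon_0 = (\phi_0, \gamma_0, \lambda_0, \beta_0, \zeta_0, \rho_0)$$

and we have

$$P(E_i = 1 \mid \Omega_i) = \varpi(m_i, \Delta t_i; \Upsilon_0),$$

$$\varpi(m, t; \Upsilon) = \exp\{-\zeta - \gamma I(m) e^{-\rho t} - \lambda t - m\beta e^{-\rho t} + \phi e^{-\rho t}\}.$$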


A.1.1 Convexity of minimization

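By a textbook differentiation,

$$\nabla_\Upsilon \varpi(m, t; \Upsilon) = -\delta(m, t; \Upsilon)\,\varpi(m, t; \Upsilon),$$

$$\delta(m, t; \Upsilon) =
\begin{pmatrix} -e^{-\rho t} \\ I(m) e^{-\rho t} \\ t \\ m e^{-\rho t} \\ 1 \\ t e^{-\rho t}[\phi - I(m)\gamma - m\beta] \end{pmatrix}
= e^{-\rho t}
\begin{pmatrix} -1 \\ I(m) \\ e^{\rho t} t \\ m \\ e^{\rho t} \\ t[\phi - I(m)\gamma - m\beta] \end{pmatrix}
\qquad (3)$$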
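
The gradient formula can be checked numerically against central finite differences. The sketch below assumes the parameter order $(\phi, \gamma, \lambda, \beta, \zeta, \rho)$ and the exponential form of $\varpi$ given in A.1; the parameter values are hypothetical and the indicator is a placeholder for the paper's $I(m)$.

```python
import math


def varpi(m, t, ups, indicator):
    phi, gamma, lam, beta, zeta, rho = ups
    decay = math.exp(-rho * t)
    return math.exp(-zeta - gamma * indicator(m) * decay - lam * t
                    - m * beta * decay + phi * decay)


def delta(m, t, ups, indicator):
    """Vector delta such that grad(varpi) = -delta * varpi."""
    phi, gamma, lam, beta, zeta, rho = ups
    decay = math.exp(-rho * t)
    return [-decay, indicator(m) * decay, t, m * decay, 1.0,
            t * decay * (phi - indicator(m) * gamma - m * beta)]


def numerical_gradient(m, t, ups, indicator, h=1e-6):
    """Central finite differences of varpi with respect to each parameter."""
    grad = []
    for j in range(len(ups)):
        up, down = list(ups), list(ups)
        up[j] += h
        down[j] -= h
        grad.append((varpi(m, t, up, indicator) - varpi(m, t, down, indicator)) / (2 * h))
    return grad


def placeholder_indicator(m):
    # Placeholder only: the paper defines I(m) elsewhere.
    return 1.0 if m > 0 else 0.0


ups = (2.0, 0.5, 0.6, 0.2, 3.0, 1.5)  # hypothetical parameter values
m, t = 3, 0.7
value = varpi(m, t, ups, placeholder_indicator)
analytic = [-d * value for d in delta(m, t, ups, placeholder_indicator)]
numeric = numerical_gradient(m, t, ups, placeholder_indicator)
max_gap = max(abs(a - b) for a, b in zip(analytic, numeric))
print(max_gap)
```

A gap of the order of the finite-difference error indicates that the signs and the parameter order in $\delta$ agree with the exponent of $\varpi$.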
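and

$$\nabla_\Upsilon \nabla_\Upsilon' \varpi(m, t; \Upsilon) = -\varpi(m, t; \Upsilon)\,[H(m, t; \Upsilon) + \delta(m, t; \Upsilon)\delta(m, t; \Upsilon)'],$$

$$H(m, t; \Upsilon) = t e^{-\rho t}
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & -I(m) \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -m \\
0 & 0 & 0 & 0 & 0 & 0 \\
1 & -I(m) & 0 & -m & 0 & -t[\phi - I(m)\gamma - m\beta]
\end{pmatrix}
\qquad (4)$$

Hence for

$$s(E, t, m; \Upsilon) = [\varpi(m, t; \Upsilon) - E]^2$$

we have

$$\nabla_\Upsilon s = 2\nabla_\Upsilon \varpi\,[\varpi - E]$$

and

$$\nabla_\Upsilon \nabla_\Upsilon' s = 2(\nabla_\Upsilon \varpi)(\nabla_\Upsilon \varpi)' + 2(\varpi - E)\nabla_\Upsilon \nabla_\Upsilon' \varpi
= 2\varpi^2 \delta\delta' - 2\varpi(\varpi - E)(H + \delta\delta')
= 2\varpi(E - \varpi)H + 2\varpi E\,\delta\delta'.$$

So

$$\frac{1}{n}\sum_{i=1}^n \nabla_\Upsilon \nabla_\Upsilon' s_i = \frac{2}{n}\sum_{i=1}^n \varpi_i(E_i - \varpi_i)H_i + \frac{2}{n}\sum_{i=1}^n \varpi_i E_i\,\delta_i\delta_i',$$

where the subscript $i$ denotes evaluation at $(E_i, \Delta t_i, m_i)$. Since, by Lemma 1 and
Birkhoff's theorem, the first (left-hand) sum goes to zero as $n \to \infty$ and the
convergence is uniform if we bound the parameters $\phi$, $\gamma$, $\beta$ from above (which is
assumed), the minimization of $S$ is an asymptotically convex problem (note that
$\delta\delta' \ge 0$).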
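
For orientation, the NLS criterion itself, i.e. the sum of the squared residuals $[\varpi - E]^2$ over the sample, can be sketched as follows. Everything about the data here is hypothetical: the parameter values, the distributions of $m$ and $\Delta t$, and the indicator standing in for the paper's $I(m)$ are chosen only to make the example run, and no actual minimization is performed.

```python
import math
import random


def varpi(m, t, ups, indicator):
    phi, gamma, lam, beta, zeta, rho = ups
    decay = math.exp(-rho * t)
    return math.exp(-zeta - gamma * indicator(m) * decay - lam * t
                    - m * beta * decay + phi * decay)


def nls_criterion(ups, data, indicator):
    """Sum of squared residuals [varpi - E]^2 over the sample."""
    return sum((varpi(m, t, ups, indicator) - e) ** 2 for e, t, m in data)


def placeholder_indicator(m):
    # Placeholder only: the paper defines I(m) elsewhere.
    return 1.0 if m > 0 else 0.0


rng = random.Random(1)
true_ups = (0.5, 0.2, 0.3, 0.1, 0.6, 1.0)  # hypothetical (phi, gamma, lambda, beta, zeta, rho)
data = []
for _ in range(2000):
    m = rng.randint(0, 3)        # hypothetical distribution of m
    t = rng.expovariate(1.0)     # hypothetical distribution of the elapsed time
    p = varpi(m, t, true_ups, placeholder_indicator)
    e = 1 if rng.random() < p else 0
    data.append((e, t, m))

shifted_ups = (0.5, 0.2, 0.3, 0.1, 1.3, 1.0)  # zeta moved away from its true value
s_true = nls_criterion(true_ups, data, placeholder_indicator)
s_shifted = nls_criterion(shifted_ups, data, placeholder_indicator)
print(s_true, s_shifted)
```

In this simulated sample the criterion is smaller at the parameters that generated the data than at the shifted ones; in practice the criterion would be handed to a bounded numerical optimizer.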

A.1.2 Consistency and Asymptotic Normality



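We use the theory of Jacob (2010) to prove the asymptotic properties of our
estimator. First, note that $f_k$, $\eta_k$, $f_k'$, $f_k''$, $d_k$ and $M_n$ from Jacob (2010)
evaluate, in our case, as

$$f_k(\Upsilon) = \varpi(m_k, \Delta t_k; \Upsilon),$$

$$\eta_k = E_k - f_k(\Upsilon_0),$$

$$f_k'(\Upsilon) = -\varpi(m_k, \Delta t_k; \Upsilon)\,\delta(m_k, \Delta t_k; \Upsilon),$$

$$f_k''(\Upsilon) = -\varpi(m_k, \Delta t_k; \Upsilon)\,[\delta(m_k, \Delta t_k; \Upsilon)\delta(m_k, \Delta t_k; \Upsilon)' + H(m_k, \Delta t_k; \Upsilon)],$$

$$d_k(\Upsilon) = \varpi(m_k, \Delta t_k; \Upsilon_0) - \varpi(m_k, \Delta t_k; \Upsilon),$$

$$M_n^{-1}(\Upsilon) = \sum_{k=1}^n \varpi^2(m_k, \Delta t_k; \Upsilon)\,\delta(m_k, \Delta t_k; \Upsilon)\delta(m_k, \Delta t_k; \Upsilon)'.$$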

First let us verify the conditions required for the consistency:

• LIP$_\Upsilon$({$\varpi(m_k, \Delta t_k; \Upsilon)$}): From the assumed boundedness of $\Upsilon$ and
from (3) it follows that the derivative of $\varpi$ is uniformly bounded by a
non-random constant $B$, so the condition is fulfilled with $g_k = 1$ and
$h(x) = B\|x\|$.

• VAR$_\Upsilon$: As $\mathrm{var}(E_i \mid \Omega_i) \le 1$, the condition follows from Jacob (2010),
Proposition 5.2.

• SI$_\Upsilon$({$D_n(\Upsilon)$}): It follows from (3) and the non-negativeness of $\phi_0$
that, for all $1 \le j \le 6$, $\frac{\partial}{\partial \Upsilon_j} d(0, t; \Upsilon) \ne 0$ for any $t > 0$. From
the continuity of the derivatives it further follows that, for any $t_0$,
$r_1 > 0, \dots, r_6 > 0$ exist such that $|\frac{\partial}{\partial \Upsilon_j} d(0, t; \Upsilon)| > r_j$ for any $\Upsilon$ from
the parametric space (recall it is bounded), from which it follows that
$|d_k(\Upsilon)| \ge \min(r_1, \dots, r_6)\|\Upsilon - \Upsilon_0\|$ whenever $m_k = 0$ and $\Delta t_k < t_0$. Hence, for any $\delta > 0$,

$$\inf_{\|\Upsilon - \Upsilon_0\| \ge \delta} \sum_{i=1}^N d_i(\Upsilon)^2 \ge \bar N \min(r_1, \dots, r_6)^2 \delta^2,$$

where $\bar N = |\{k \le N : m_k = 0,\ \Delta t_k < t_0\}|$. Thanks to the ergodicity
of $S$ and Lemma 1, $\bar N \to \infty$, so the condition is proved.



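Thanks to the conditions and Proposition 3.1 of Jacob (2010) the strong
consistency of $\Upsilon_n$ is proved, implying the same for $\Theta_n$.

As to the asymptotic normality, note first that, from the ergodicity,

$$\frac{1}{n} M_n^{-1} \to M^- := \int \varpi^2(m, t; \Upsilon_0)\,\delta(m, t; \Upsilon_0)\delta(m, t; \Upsilon_0)'\,dQ(m, t) \qquad (5)$$

where $Q$ is a stationary distribution of the process $(m_i, \Delta t_i)$.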
Lemma 2. If $\rho \ne 0$ and $\beta \ne 0$ then $M^-$ is regular.

Proof. We show that $M^- > 0$, i.e.

$$v' M^- v > 0, \qquad v \ne 0,$$

which suffices for the regularity. As

$$M^- = E\,dd', \qquad d = \varpi(m, t; \Upsilon_0)\,\delta(m, t; \Upsilon_0), \qquad (m, t) \sim Q,$$

which clearly implies that $M^- \ge 0$, it suffices to show that $d'v \ne 0$ on a set
with a positive probability for any non-random vector $v \ne 0$; in particular, as
$\varpi > 0$ everywhere, it suffices to show that

$$r(m, t) \ne 0 \text{ with a non-zero probability}, \qquad r(m, t) = v'\delta(m, t; \Upsilon_0),$$

which we do by contradiction: Assuming the converse we get

$$\sum_{j=0}^{\infty} p_j \int 1[r(j, t) \ne 0]\,g(t|j)\,dt = 0,$$

where $g(\bullet|m)$ is a density of $t$ given $m$ and $p_j$ is the probability that $m = j$. As the
distribution of $t$ is a mixture of exponential distributions, $g(\bullet|m)$ is continuous
and non-zero, and, as $m$ is a mixture of Poisson distributions, $p_j > 0$, $j \in \mathbb{N}$;
therefore, for each $j \in \mathbb{N}$ there should exist a non-trivial interval $J_j$ such that

$$r(j, t) = 0, \qquad t \in J_j,\ j \in \mathbb{N},$$

which would, for each $j$, imply the existence of $t_j$ and $\Delta_j > 0$ such that

$$\frac{\partial}{\partial s} r(j, t_j + s) = 0, \qquad s \in [-\Delta_j, \Delta_j],$$

yielding

$$e^{-\rho t_j} e^{-\rho s}\bigl(\rho(v_1 - v_2 I(j) - v_4 j) - v_6 c_j(\rho(t_j + s) - 1)\bigr) + v_3 = 0, \qquad s \in [-\Delta_j, \Delta_j],$$

where $c_j = \phi - I(j)\gamma - j\beta$, which could happen only if

$$v_3 = 0, \qquad v_1 - v_2 I(j) - v_4 j = 0, \qquad v_6 c_j = 0, \qquad j \in \mathbb{N}.$$

From the regularity of the matrix $(1, I(j), j)_{j \ge 0}$, we get $v_1 = v_2 = v_4 = 0$.
Moreover, since at least one $c_j$, $j \in \mathbb{N}$, is non-zero, we have also $v_6 = 0$, which
implies $v = (0, 0, 0, 0, v_5, 0)$, $v_5 \ne 0$. However, since then $v'\delta(m, t; \Upsilon_0) = v_5$, we
get $v_5 = 0$, which is a contradiction to the non-triviality of $v$. □
Consequently, it follows from the continuity of the inversion operator that
also

$$n M_n \to M, \qquad M := (M^-)^{-1} \qquad (6)$$

Now let us verify the conditions needed by Jacob (2010) to get the asymptotic
normality:

• LIP$_\Upsilon$($f''_{k;i,j}$): Follows from the twice-differentiability

• VAR$_\Upsilon$ was discussed when proving consistency

• UNC({$\Upsilon_n$}) follows from

• SI({$(M_n[i, j])^{-1}$}): Since $M^- > 0$, it has to be $M > 0$, implying
positiveness of its components, which proves, together with (6), the
convergence rate of $M_n$ to be exactly $n^{-1}$, from which SI follows easily.

• LIM({$\Upsilon_n$}): follows from the convergence of $M_n$ to zero and boundedness
of the derivatives.

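Now, put

$$\xi_{i,n} = n^{-1/2}\,\eta_i\,f_i'(\Upsilon_0).$$

Clearly, $\xi_{i,n}$ is a triangular array of martingale differences with

$$\lim_n \sum_{i=1}^n E(\xi_{i,n}\xi_{i,n}' \mid \xi_{1,n}, \dots, \xi_{i-1,n}) = \lim_n \frac{1}{n} A_n = A,$$

$$A_n = \sum_{i=1}^n a_i, \qquad a_i = \varpi_i(\Upsilon_0)(1 - \varpi_i(\Upsilon_0))\,f_i'(\Upsilon_0) f_i'(\Upsilon_0)',$$

$$A = E\bigl(\varpi(m, t; \Upsilon_0)(1 - \varpi(m, t; \Upsilon_0))\,f'(m, t; \Upsilon_0) f'(m, t; \Upsilon_0)'\bigr)$$

by the ergodicity. Therefore and because, for each $\epsilon > 0$,

$$\sum_{i=1}^n E(\|\xi_{i,n}\|^2\,1[\|\xi_{i,n}\| > \epsilon] \mid \xi_{1,n}, \dots, \xi_{i-1,n}) = \frac{1}{n}\sum_{i=1}^n \|a_i\|\,1[\tfrac{1}{n}\|a_i\| > \epsilon] \to 0$$

(recall that $|\eta_i| \le 1$), we can use the CLT for multidimensional martingale
difference arrays (as cited by Jacob (2010) in Section 8) to get

$$\sum_{i=1}^n \xi_{i,n} \to N(0, A) \quad \text{in distribution.}$$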
Consequently, putting,



(note that $M_n$ has to be regular starting from a certain $n$) and using (5) and
(6) we finally get that



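which gives, by Proposition 6.1 of Jacob (2010), that

$$n^{-1/2} M_n^{-1}(\Upsilon_n - \Upsilon_0) \to N(0, A)$$

in distribution so, from (5), also

$$n^{1/2} M^-(\Upsilon_n - \Upsilon_0) \to N(0, A),$$

yielding

$$n^{1/2}(\Upsilon_n - \Upsilon_0) \to N(0, MAM).$$

Now, by using the continuous mapping theorem, we get

$$n^{1/2}(\Theta_n - \Theta_0) \to N(0, T M A M T'),$$

where $T$ is the matrix of the derivatives of the transformation from $\Upsilon$ to $\Theta$,
evaluated at $\Upsilon_0$, allowing us to approximate

$$\Theta_n - \Theta_0 \sim N(0, T M_n A_n M_n T').$$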
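
As a hedged illustration of the matrix $T$: if the re-parametrization is read as $\phi = \kappa/\rho$, $\zeta = \kappa/\rho + \eta$ (so that $\kappa = \phi\rho$ and $\eta = \zeta - \phi$), and if $\Theta$ is ordered as $(\kappa, \gamma, \lambda, \beta, \eta, \rho)$ and $\Upsilon$ as $(\phi, \gamma, \lambda, \beta, \zeta, \rho)$, both of which are assumptions made here, then the matrix of derivatives of $\Theta$ with respect to $\Upsilon$ at $\Upsilon_0$ would be:

```latex
T = \left.\frac{\partial \Theta}{\partial \Upsilon'}\right|_{\Upsilon_0} =
\begin{pmatrix}
\rho_0 & 0 & 0 & 0 & 0 & \phi_0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
-1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
```

The first row comes from $\kappa = \phi\rho$ and the fifth from $\eta = \zeta - \phi$; the remaining parameters are unchanged by the transformation.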